Language-independent hybrid MT with PRESEMT

Authors

  • George Tambouratzis
  • Sokratis Sofianopoulos
  • Marina Vassiliou
Abstract

The present article provides a comprehensive review of the work carried out on developing PRESEMT, a hybrid language-independent machine translation (MT) methodology. This methodology has been designed to facilitate rapid creation of MT systems for unconstrained language pairs, setting the lowest possible requirements on specialised resources and tools. Given the limited availability of resources for many languages, only a very small bilingual corpus is required, while language modelling is performed by sampling a large target language (TL) monolingual corpus. The article summarises implementation decisions, using the Greek-English language pair as a test case. Evaluation results are reported for both objective and subjective metrics. Finally, the main error sources are identified and directions for improving this hybrid MT methodology are described.

Similar resources

PRESEMT: Pattern Recognition-based Statistically Enhanced MT

This document contains a brief presentation of the PRESEMT project, which aims to develop a novel language-independent methodology for the creation of a flexible and adaptable MT system.

Evaluating the Translation Accuracy of a Novel Language-Independent MT Methodology

The current paper evaluates the performance of the PRESEMT methodology, which facilitates the creation of machine translation (MT) systems for different language pairs. This methodology aims to develop a hybrid MT system that extracts translation information from large, predominantly monolingual corpora, using pattern recognition techniques. PRESEMT has been designed to have the lowest possible...

Comparing CRF and template-matching in phrasing tasks within a Hybrid MT system

The present article focuses on improving the performance of a hybrid Machine Translation (MT) system, namely PRESEMT. The PRESEMT methodology is readily portable to new language pairs, and allows the creation of MT systems with minimal reliance on expensive resources. PRESEMT is phrase-based and uses a small parallel corpus from which to extract structural transformations from the source langua...

Implementing a Language-Independent MT Methodology

The current paper presents a language-independent methodology, which facilitates the creation of machine translation (MT) systems for various language pairs. This methodology is implemented in the PRESEMT hybrid MT system. PRESEMT has the lowest possible requirements on specialised resources and tools, given that for many languages (especially less widely used ones) only limited linguistic resou...

Expanding the Language model in a low-resource hybrid MT system

The present article investigates the fusion of different language models to improve translation accuracy. A hybrid MT system that combines example-based MT and Statistical MT principles, recently developed in the European Commission-funded PRESEMT project, is used as a starting point. In this article, the syntactically-defined phrasal language models (NPs, VPs etc.) used by this MT system are supp...

Publication date: 2013